EDK - Display16
---------------

The Display16 is used to display video frames. On real hardware, this job is done by either a TV set or a monitor.

Each VIC contains a pointer to a Display16 object. For initialisation, the Display16 provides the following functions:

  void SetResolution(int iWidth, int iHeight);
  void SetDefaultVisibleArea(int iLeft, int iTop, int iRight, int iBottom);
  void SetDefaultColors(byte aabColor[16][3]);

After the VIC has created its display with 'pDisplay = new Display16' and 'pDisplay->Init()', it must set the resolution of the bitmap. For example, a PAL VIC6569 will call pDisplay->SetResolution(63 * 8, 312).

The VIC may also call pDisplay->SetDefaultVisibleArea() in order to hide frame and overscan areas. If this call is omitted, the whole screen will be visible. The VIC shouldn't read the parameters of SetDefaultVisibleArea() from the configuration file because the user interface belongs to the Display16 only, or in reality, the TV.

As the last step of initialisation, the VIC must call pDisplay->SetDefaultColors(). The parameter is a two-dimensional array of 16 RGB colors. Red, green and blue each range from 0 to 255. Again, the VIC shouldn't let the user change the colors. The Display16 object itself will read the current settings from the configuration file and switch into B/W mode, if desired.
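The initialisation sequence described above could be sketched as follows. The Display16 class here is only a hypothetical stand-in that records what it is given; its member variables, the CreatePalDisplay() helper, the visible-area numbers and the grey-ramp palette are all assumptions made for illustration, not the real EDK implementation or the real VIC6569 values.

```cpp
#include <cassert>

typedef unsigned char byte;

// Hypothetical stand-in for Display16: it only records what the VIC sets.
class Display16 {
public:
  int iWidth = 0, iHeight = 0;
  int iLeft = 0, iTop = 0, iRight = 0, iBottom = 0;
  byte aabPalette[16][3] = {};

  void Init() {}

  void SetResolution(int iNewWidth, int iNewHeight) {
    assert(iNewWidth > 0 && iNewHeight > 0);
    iWidth = iNewWidth;
    iHeight = iNewHeight;
    // Until SetDefaultVisibleArea() is called, the whole bitmap is visible.
    iLeft = 0; iTop = 0; iRight = iWidth; iBottom = iHeight;
  }

  void SetDefaultVisibleArea(int iL, int iT, int iR, int iB) {
    iLeft = iL; iTop = iT; iRight = iR; iBottom = iB;
  }

  void SetDefaultColors(byte aabColor[16][3]) {
    for (int i = 0; i < 16; i++)
      for (int c = 0; c < 3; c++)
        aabPalette[i][c] = aabColor[i][c];
  }
};

// What a PAL VIC might do when it creates its display (illustrative only).
Display16* CreatePalDisplay() {
  Display16* pDisplay = new Display16;
  pDisplay->Init();
  pDisplay->SetResolution(63 * 8, 312);

  // Made-up numbers that hide some border area; not the real 6569 values.
  pDisplay->SetDefaultVisibleArea(64, 16, 448, 288);

  // Placeholder grey ramp, not the real palette: entry i is (i*17, i*17, i*17).
  byte aabColor[16][3];
  for (int i = 0; i < 16; i++)
    for (int c = 0; c < 3; c++)
      aabColor[i][c] = static_cast<byte>(i * 17);
  pDisplay->SetDefaultColors(aabColor);
  return pDisplay;
}
```

The order matters in this sketch: SetResolution() comes first so that the default visible area can fall back to the full bitmap, and SetDefaultColors() comes last, as the text requires.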

After the initialisation is complete, the VIC can feed the Display16 with data. The interface between the VIC and the Display16 is based on raster lines:

  byte* GetNextLine();
  void NextFrame();

Whenever the VIC is at the beginning of a new line, it calls pDisplay->GetNextLine(). This function returns a pointer to a buffer of iWidth bytes, one byte per pixel. The VIC will then fill that buffer with pixel data. Because the Display16 uses only the color in bits 0..3 of each byte, the VIC may use the remaining bits 4..7 for border and sprite collision information. When the VIC has finished the last line of the current frame, it calls pDisplay->NextFrame().
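As a rough sketch of this raster line protocol, the loop below fills one frame line by line. The Display16 class is again a hypothetical minimal stand-in, and the names COLOR_MASK, FLAG_BORDER, ColorOf() and DrawFrame(), the 32-pixel border and the test pattern are assumptions made for the example only.

```cpp
#include <cassert>
#include <vector>

typedef unsigned char byte;

// Hypothetical minimal Display16 that hands out one reusable line buffer.
class Display16 {
public:
  int iWidth, iHeight, iLinesTotal, iFrames;
  std::vector<byte> abLine;

  Display16(int iW, int iH)
    : iWidth(iW), iHeight(iH), iLinesTotal(0), iFrames(0), abLine(iW) {}

  byte* GetNextLine() { iLinesTotal++; return abLine.data(); }
  void NextFrame() { iFrames++; }
};

const byte COLOR_MASK  = 0x0F; // bits 0..3: the color seen by the Display16
const byte FLAG_BORDER = 0x10; // bit 4: assumed private flag of the VIC

// The color part of a pixel byte, as the Display16 would interpret it.
byte ColorOf(byte bPixel) { return static_cast<byte>(bPixel & COLOR_MASK); }

// VIC side: request a buffer per raster line, fill it, then end the frame.
void DrawFrame(Display16* pDisplay) {
  for (int y = 0; y < pDisplay->iHeight; y++) {
    byte* pbLine = pDisplay->GetNextLine();
    for (int x = 0; x < pDisplay->iWidth; x++) {
      // Test pattern: the color changes every 8 pixels and every line.
      byte bPixel = static_cast<byte>((x / 8 + y) & COLOR_MASK);
      // Mark the outer 32 pixels as border in the spare bits (made-up width).
      if (x < 32 || x >= pDisplay->iWidth - 32)
        bPixel |= FLAG_BORDER;
      pbLine[x] = bPixel;
    }
  }
  pDisplay->NextFrame();
}
```

Because the flags live in bits 4..7, the VIC can later test them for collisions without keeping a second buffer, while the display simply masks them off.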

A Display16 can operate in two modes, line based and frame based. In line based mode, the Display16 will always return the same pointer, and each call to pDisplay->GetNextLine() will copy the previous line into the video RAM. It depends on the video hardware whether the line becomes visible immediately or is just copied into an offscreen buffer. In frame based mode, the pointers returned by each call to pDisplay->GetNextLine() will differ. The whole frame is then either output at pDisplay->NextFrame(), or the video hardware is set into a mode where it crops the visible area itself.
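The difference between the two modes can be illustrated with the sketch below. Modelling the modes as two subclasses (LineDisplay and FrameDisplay), simulating the video RAM with a std::vector, and the FillFrame() helper are all assumptions for the sake of the example; the text does not say how the real Display16 selects its mode.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

typedef unsigned char byte;

// Abstract interface as seen by the VIC (virtual, as the text implies).
class Display16 {
public:
  virtual ~Display16() {}
  virtual byte* GetNextLine() = 0;
  virtual void NextFrame() = 0;
};

// Line based: one buffer, copied to "video RAM" when the next line starts.
class LineDisplay : public Display16 {
public:
  int iWidth, iHeight, iLine;
  std::vector<byte> abLine, abVideoRam;

  LineDisplay(int iW, int iH)
    : iWidth(iW), iHeight(iH), iLine(-1), abLine(iW), abVideoRam(iW * iH) {}

  byte* GetNextLine() override { Flush(); iLine++; return abLine.data(); }
  void NextFrame() override { Flush(); iLine = -1; }

private:
  // Copy the line the VIC has just finished, if there is one.
  void Flush() {
    if (iLine >= 0 && iLine < iHeight)
      std::memcpy(&abVideoRam[iLine * iWidth], abLine.data(), iWidth);
  }
};

// Frame based: every line has its own address; output happens once per frame.
class FrameDisplay : public Display16 {
public:
  int iWidth, iHeight, iLine;
  std::vector<byte> abFrame, abVideoRam;

  FrameDisplay(int iW, int iH)
    : iWidth(iW), iHeight(iH), iLine(0), abFrame(iW * iH), abVideoRam(iW * iH) {}

  byte* GetNextLine() override {
    assert(iLine < iHeight);
    return &abFrame[(iLine++) * iWidth];
  }
  void NextFrame() override { abVideoRam = abFrame; iLine = 0; }
};

// VIC side: identical for both modes. Line y is filled with color y & 15.
void FillFrame(Display16& display, int iWidth, int iHeight) {
  for (int y = 0; y < iHeight; y++)
    std::memset(display.GetNextLine(), y & 0x0F, iWidth);
  display.NextFrame();
}
```

The point of the sketch is that FillFrame() does not know which mode is active: it only uses the two interface functions, and both variants end up with the same picture.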

Some considerations about Display16 performance:

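Why not display one or eight pixels per call? Because, on the Pentium, calling a virtual function will add an overhead of 10 cycles. In the listing, each pair of instructions executes in one cycle, while CALL and RET cannot be paired and take two cycles each: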

  mov ECX,[ESI]VIC.pDisplay           ;cycle 1
  nop

  nop                                 ;cycle 2
  nop

  mov EAX,[ECX]Display16.__lpvtable   ;cycle 3
  nop

  nop                                 ;cycle 4
  nop

  call [EAX+FunctionIndex]            ;cycles 5-6, unpaired

  push ESI                            ;cycle 7
  mov ESI,ECX

  nop                                 ;cycle 8
  pop ESI

  ret                                 ;cycles 9-10, unpaired

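If the NOPs are replaced by useful instructions, each of the seven NOPs no longer costs its half cycle, and the overhead is reduced to 10 - 3.5 = 6.5 cycles. With one call per eight pixels, which is roughly one million calls per second, this amounts to 6.5% on a Pentium-100. This is still too much.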

Why not display the whole frame at once in line based mode, too? Because, on the Pentium, the L1 data cache is only 8 KB, while a frame of 504 * 312 bytes takes about 154 KB. Even if a frame completely fits into the L2 cache, copying with MOVSD from pipelined burst L2 cache on the Pentium 75/90/100 is still slower by a factor of 6 compared to the internal L1 cache.

Why not let the VIC perform the display? This is fast, but loses flexibility and maintainability. A VIC656x which outputs directly to the screen would depend on the host operating system. If the OS changes, the VIC656x must be changed, too. Changing code costs time and may introduce new bugs. Even within Windows, there would have to be different versions of the VIC for DIBSection/WinG, 8, 16 and 32 bit DirectDraw at horizontal scaling factors of 1, 2, and 3, and DirectDraw fullscreen. This makes eleven combinations, or eleven different implementations of the VIC656x class. Unmaintainable code, or in other words, spaghetti. And if someone writes a VIC20 emulator, the same mess will start over again.

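That's why the Display16 uses a line based approach as a compromise between speed and flexibility. It costs about 6.5 cycles per line instead of 6.5 cycles per eight pixels. With about 64 groups of eight pixels per line, this is 6.5% / 64, or about 0.1% on the Pentium-100.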
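The percentages in this section can be checked with a few lines of arithmetic. The sketch below assumes a VIC clock of about 1 MHz with eight pixels per clock cycle and 64 cycles per line, which are the round numbers the figures above imply; the function and constant names are made up for the example, and the per-pixel result is derived here rather than stated in the text.

```cpp
#include <cassert>
#include <cmath>

const double HOST_HZ = 100e6;     // Pentium-100
const double VIC_HZ = 1e6;        // assumed: about one VIC cycle per microsecond
const int CYCLES_PER_LINE = 64;   // assumed round number used in the text

// Cycles per call once every NOP (half a cycle each) does useful work.
double CallCycles(double dRawCycles, int iNops) {
  return dRawCycles - 0.5 * iNops;
}

// Share of the host CPU, in percent, spent on call overhead alone.
double OverheadPercent(double dCyclesPerCall, double dCallsPerSecond) {
  return 100.0 * dCyclesPerCall * dCallsPerSecond / HOST_HZ;
}

// One call per pixel, per eight pixels and per raster line.
double PerPixelPercent()   { return OverheadPercent(CallCycles(10, 7), VIC_HZ * 8); }
double PerEightPercent()   { return OverheadPercent(CallCycles(10, 7), VIC_HZ); }
double PerLinePercent()    { return OverheadPercent(CallCycles(10, 7), VIC_HZ / CYCLES_PER_LINE); }
```

Under these assumptions the three granularities come out at 52%, 6.5% and roughly 0.1% of the host CPU, which is why only the per-line interface is cheap enough.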
